1.  INTRODUCTION

A  method  is  presented  which  expands  the  class  of  min-max  problems 
which  can  be  solved  using  an  algorithm  programmable  on  a  digital  computer. 

The min-max problem deals with determining u* and y* such that

    $J(u^*, y^*) = \min_{u \in U} \max_{y \in Y} J(u, y)$

where U and Y are convex, bounded subsets of Hilbert spaces.

This  type  of  problem  can  occur  quite  frequently  when  formulating 
mathematical  models  of  physical  systems.  A  quality  of  the  system  such  as 
cost  or  position  of  some  object  can  be  represented  as  the  variable  J  which 
we  call  a  performance  index. 

Now  J  is  a  function  of  some  variables  represented  by  the  vector  u, 
which  the  designer  can  control  or  specify,  but  J  is  also  a  function  of  other 
variables  y  which  cannot  be  specified  and  which  are  known  only  to  lie  within 
a  given  range.  These  variables  may  be  unmeasurable  and  if  they  vary,  it  is 
assumed that the change is slow enough so that they may be considered time
invariant  while  the  control  is  being  applied. 

When faced with unknown variables, an optimally adaptive controller [1]
which  combines  estimation  and  optimization  could  be  used;  however,  parameter 
estimation  may  employ  systems  with  high  gains,  which  are  usually  susceptible 
to  additive  noise.  An  optimal  controller  in  a  noisy  system  may  not  keep  the 
performance  index  near  the  minimum  as  in  the  noiseless  case. 

The difficulties involved in parameter estimation in the presence of noise lead one to other approaches to the problem. One such approach is to find the control which minimizes the maximum value of the performance index as a function of plant parameters.


There has been considerable theoretical development of the min-max problem [2,3,4], and the author is aware of four algorithms [5,6,7,8] which solve a min-max problem. These algorithms treat the minimizing variable u as a vector in a finite dimensional Euclidean space. The space Y is a finite set of points and y is an element from this space. The space Y can be considered a definite set of values which may be assumed by the unknown parameters.

This paper is concerned with expanding the algorithms of references [6] and [8]
to  deal  with  the  class  of  problems  where  the  minimizing  variable  is  a  vector 
in  a  function  space. 


    $u(t) = [\,u_1(t),\ u_2(t),\ \ldots,\ u_M(t)\,]'$

where $t_0 \le t \le T$.

In other words, u(t) is the control which can change with time. For example, consider the system with the state equation:

    $\dot{x} = Ax + Bu$

and a performance index

    $J = x'(T)\,Q\,x(T) + \int_{t_0}^{T} u'(\tau)\,H\,u(\tau)\,d\tau.$


Assume H is a diagonal matrix and is not a function of time. Also, assume matrices A and B are constant over time, and assume that some of the elements in A are unknown, but lie within a given range. Furthermore, assume that the ith unknown element in A can take $K_i$ values. Then if there are r such elements, we have a set of $\prod_{i=1}^{r} K_i$ points. Let $N = \prod_{i=1}^{r} K_i$. Then the set of N points constitutes the maximizing space of the min-max problem

    $\min_{u} \max_{i \in 1,2,\ldots,N} J_i(u)$    (1.1)

where $J_i(u)$ is the performance index when A assumes the ith value in the set $\{A_1, A_2, \ldots, A_N\}$.
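As an illustration of how the max space of (1.1) is built, the following minimal sketch enumerates every combination of candidate values for the unknown elements of A. The candidate values and the helper name `parameter_grid` are hypothetical and chosen only for this example; they do not come from the original work.

```python
from itertools import product
from math import prod


def parameter_grid(candidate_values):
    """Return every combination of candidate values, one tuple per point.

    candidate_values[i] holds the K_i values assumed for the ith unknown
    element of A, so the result has prod(K_i) entries.
    """
    return list(product(*candidate_values))


# Hypothetical data: r = 3 unknown elements with K_1 = 2, K_2 = 3, K_3 = 2.
candidate_values = [(-1.0, -0.5), (0.1, 0.2, 0.3), (2.0, 4.0)]
grid = parameter_grid(candidate_values)
N = len(grid)

# The number of points in the max space is the product of the K_i.
assert N == prod(len(values) for values in candidate_values)
```

Each tuple in `grid` corresponds to one matrix $A_i$, and therefore to one performance index $J_i(u)$. The count grows multiplicatively with r, which is why the size of N matters later in the discussion.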

Demjanov [4] developed a Taylor's series expansion of the max function
for  the  Euclidean  or  parameter  space  problem.  This  expansion  for  Hilbert 
spaces  is  presented  in  Chapter  2.  Also,  from  the  theoretical  development  in 
his  paper  one  can  construct  a  gradient  algorithm. 

According to the algorithm by D. M. Salmon, one should start with a subspace V' consisting of two elements from the original max space. Then find the u* which solves the min-max problem consisting of the original min space and the new max space V'. With this u* find the element i' in the original max space which maximizes the performance index. Then add i' to V' and repeat. The stopping criteria will not be presented here; the reader is referred to the paper.
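The loop described above can be sketched as follows for a scalar control. This is only an illustration under stated assumptions: the sub min-max problem is solved by brute force on a grid (a stand-in, not Salmon's method), and the stopping rule used here (stop when the worst-case index is already in V') is an assumption made for this sketch, since the source defers the actual criteria to Salmon's paper. All names and data are hypothetical.

```python
def worst_index(J, u, indices):
    """Index in `indices` that maximizes the performance index J(i, u)."""
    return max(indices, key=lambda i: J(i, u))


def solve_sub_minmax(J, working_set, u_grid):
    """Brute-force stand-in for the sub min-max solver: best u on a grid."""
    return min(u_grid, key=lambda u: max(J(i, u) for i in working_set))


def working_set_minmax(J, N, u_grid, initial=(0, 1), max_iter=100):
    """Grow the working max space V' until the worst case is already in it."""
    working = list(initial)
    u_star = solve_sub_minmax(J, working, u_grid)
    for _ in range(max_iter):
        i_new = worst_index(J, u_star, range(N))
        if i_new in working:  # assumed stopping rule, not from the source
            break
        working.append(i_new)
        u_star = solve_sub_minmax(J, working, u_grid)
    return u_star, working


# Hypothetical example: J_i(u) = (u - a_i)^2 with four parameter values.
a = [0.0, 1.0, 4.0, 2.5]
J = lambda i, u: (u - a[i]) ** 2
u_grid = [k / 100 for k in range(0, 401)]
u_star, working = working_set_minmax(J, len(a), u_grid)
```

In this example the working set ends with three of the four indices, which shows the appeal of the approach: the sub-problems involve fewer max points than the original problem.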

Finding the value of u* for the min-max problem with V' as the max space is itself a min-max problem; therefore, this is not a complete algorithm.
The Newton-Raphson method by J. Medanic could be used to solve this sub min-max problem. If the performance index is quadratic in the minimizing variable and the space V' is small, then the solution could be found in one step. However, a combination of a gradient method, such as the one presented by J. Heller, and the Newton-Raphson method will be more efficient than the combination of Salmon's method and the Newton-Raphson method. The reason for this is that the space V' after several iterations will be larger than necessary, and because of this, the Newton-Raphson method will become very slow. For this reason we will not employ Salmon's algorithm.

The algorithm employing an elimination method by J. Medanic theoretically cannot be used for solving problems in function spaces. In this
algorithm  one  begins  with  a  space  bounded  by  hyperplanes.  At  an  interior 
point  of  this  space  one  can  construct  a  hyperplane  which  divides  the  space  into 
two  parts.  We  are  able  to  determine  that  the  min-max  does  not  exist  on  one 
of  the  two  parts.  We  then  can  eliminate  this  half  from  the  space  being 
considered.  One  proceeds  in  this  manner  until  the  space  that  has  not  been 
eliminated  consists  of  a  very  small  volume. 

The  main  difficulty  with  trying  to  use  this  algorithm  to  solve 
problems  in  function  spaces  is  that  it  is  impossible  to  enclose  any  region 
large  or  small  with  a  finite  number  of  hyperplanes.  To  enclose  an  area  with 
an  infinite  number  of  hyperplanes  would  take  an  infinite  amount  of  time. 

Therefore,  we  will  use  a  gradient  method  until  we  get  close  to  the 
min-max  and  then  use  a  second  order  method  to  reach  the  min-max. 


Let  us  consider  for  a  moment  minimization  problems  where  the  function 
is  continuous  and  smooth.  The  min-max  problem  does  not  fall  in  this  class 
because,  in  general,  the  max  function  is  not  smooth.  The  Newton-Raphson  method 
which  we  may  alternately  call  the  second  order  method  or  the  quadratic  method 
is  useful  only  in  regions  in  which  the  function  is  for  practical  purposes 
quadratic;  that  is,  the  function  may  not  be  quadratic  in  larger  regions,  but 
in  a  small  region  near  the  minimum  it  may  be  for  all  practical  purposes  quadratic 
and  the  quadratic  method  will  be  useful  in  this  region.  In  problems  where  the 
function  is  not  quadratic  except  near  the  minimum  it  is  generally  desirable  to 
use  some  form  of  gradient  method  until  you  are  near  the  minimum  and  then 
switch  to  the  quadratic  method.  The  quadratic  method  will  give  a  direction 
and  step  size,  but  if  the  function  is  not  quadratic,  the  step  may  be  larger 
than  one  that  decreases  the  function.  An  acceptable  procedure  then  would  be 
to  use  the  direction  found  but  decrease  the  step.  However,  since  each  iteration 
in a quadratic method is more complex than that of a gradient algorithm, it is
more  profitable  to  use  a  gradient  method  until  one  is  near  the  minimum. 

The min-max case is similar, and a gradient method should be used before a quadratic method is used. Moreover, even if each function in (1.1) is quadratic, we will show that when the number of points N in the max space is large (10 or more points can be considered large) it is still advisable to use a gradient method first.

We  will  briefly  look  at  what  is  contained  in  Chapters  2  through  7. 
Chapter 2 is concerned with finding the first and second terms of the
Taylor's  series  expansion  of  the  max  function  in  Hilbert  spaces.  This  is 
necessary  before  developing  a  gradient  and  second  order  algorithm. 

Chapter  3  deals  with  the  basic  proofs  and  methods  necessary  to 
develop  a  gradient  algorithm  in  Hilbert  spaces. 

Chapter  4  presents  the  basic  proofs  and  methods  for  the  second 
order  algorithm  in  Hilbert  spaces. 

Chapter  5  shows  how  to  represent  the  gradients  and  other  expressions 
derived  in  Chapters  3  and  4  in  function  spaces,  a  type  of  Hilbert  space. 

Chapter  6  presents  some  experimental  results  and  discusses  possible 
difficulties  with  the  gradient  and  second  order  algorithms  and  discusses  the 
problems  encountered  when  the  gradient  algorithm  was  programmed. 

Chapter  7  describes  the  detailed  steps  in  constructing  the  gradient 
and  second  order  algorithms  for  solving  the  min-max  problem  in  function  spaces. 


2.  TAYLOR'S  SERIES  EXPANSION  OF  THE  MAX  FUNCTION

Before we develop the first and second order algorithms, we must be
able  to  find  the  first  and  second  terms  of  the  Taylor's  series  expansion  of 
the  max  function.* 

Now, even if the functions $J_i(x)$ are continuous and smooth, the max function $J(x) = \max_{i \in 1,\ldots,N} J_i(x)$ is continuous but not necessarily smooth. Therefore, J(x) is not in general Frechet-differentiable, but we will show that it is Gateaux-differentiable.

Minimize the functional

    $F(y) = \max_{i \in 1,2,\ldots,N} f_i(y)$    (2.1)

where $f_i(y)$ is a real functional defined over a subset S of a Hilbert space H. S is closed and bounded, and $f_i$ is Frechet-differentiable l times where $1 \le l < \infty$. The nth Frechet differential in the direction g is denoted $\partial^n f_i(y) / \partial g^n$. Then we can write the following Taylor's series expansion of $f_i(y)$ in the direction $g \in H$ ($\|g\| \le M < \infty$) and with step size α such that $(y + \alpha g) \in S$:

    $f_i(y + \alpha g) = f_i(y) + \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(|\alpha|^l)$    (2.2)

where

    $\frac{o_i(|\alpha|^l)}{|\alpha|^l} \to 0$ as $\alpha \to 0$.

* This chapter follows the development of Demjanov [4] but is in Hilbert spaces.
The notation 1,N designates the index set {1,2,...N}.

Consider $F(y) = \max_{i \in 1,N} f_i(y)$.

We will show F(y) is Gateaux-differentiable.

First we must define R(y,ε) and P(y). Let

    $R(y,\varepsilon) = \{\, i \mid i \in 1,N,\ F(y) - f_i(y) \le \varepsilon \,\}$

    $P(y) = \{\, i \mid i \in 1,N,\ F(y) = f_i(y) \,\}$.

Now since 1,N is a finite set, there exists a δ such that for all ε < δ, R(y,ε) = P(y).
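The two index sets can be illustrated with a short sketch. It assumes the values $f_i(y)$ at a fixed y are already available as a list, uses 1-based indices to match the text, and takes the ε-active set to include equality; the function name `active_sets` is hypothetical.

```python
def active_sets(values, eps):
    """Return F(y), P(y) and R(y, eps) from the list of values f_i(y).

    P collects the indices attaining the maximum; R collects the indices
    whose value is within eps of the maximum. Indices are 1-based.
    """
    F = max(values)
    P = [i for i, v in enumerate(values, start=1) if v == F]
    R = [i for i, v in enumerate(values, start=1) if F - v <= eps]
    return F, P, R


# Hypothetical values f_1(y), ..., f_4(y) at some fixed point y.
values = [1.0, 3.0, 2.9, 3.0]
F, P, R_wide = active_sets(values, eps=0.2)
_, _, R_tight = active_sets(values, eps=0.05)
```

With the wide tolerance, index 3 joins the set even though it does not attain the maximum; with the tight tolerance R(y,ε) collapses to P(y), as the finiteness argument above predicts.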

Now for any two finite sets of real numbers A and B, with elements denoted A(i) and B(i) respectively,

    $\max_{i \in 1,N} \{A(i) + B(i)\} \ge \max_{i \in 1,N} A(i) + \max_{i \in P} B(i)$    (2.3)

where

    $P = \{\, i \mid i \in 1,N,\ A(i) = \max_{j \in 1,N} A(j) \,\}$.

Proof: For any $i' \in 1,N$

    $\max_{i \in 1,N} \{A(i) + B(i)\} \ge A(i') + B(i')$.    (2.4)

Now (2.4) is true for $i' \in P$, and for such i', $A(i') = \max_{i \in 1,N} A(i)$, so

    $\max_{i \in 1,N} \{A(i) + B(i)\} \ge \max_{i \in 1,N} A(i) + B(i')$    (2.5)

and (2.5) is true for any $i' \in P$, including the one which attains $\max_{i \in P} B(i)$. Hence

    $\max_{i \in 1,N} \{A(i) + B(i)\} \ge \max_{i \in 1,N} A(i) + \max_{i \in P} B(i)$.

Q.E.D.
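A quick numerical check of inequality (2.3) may help fix ideas. The sketch below evaluates both sides for given lists; the helper name `lemma_sides` and the test data are illustrative assumptions, and integer data are used so that the equality test defining P is exact.

```python
import random


def lemma_sides(A, B):
    """Return (left side, right side) of inequality (2.3) for lists A and B."""
    lhs = max(a + b for a, b in zip(A, B))
    a_max = max(A)
    P = [i for i, a in enumerate(A) if a == a_max]  # maximizers of A
    rhs = a_max + max(B[i] for i in P)
    return lhs, rhs


# One hand-checkable case: P = {1, 2} (0-based), so rhs = 3 + max(-1, 2).
example = lemma_sides([1, 3, 3, 0], [10, -1, 2, 5])

# Random integer trials; the inequality should hold in every one.
rng = random.Random(0)
all_hold = True
for _ in range(1000):
    A = [rng.randint(-5, 5) for _ in range(6)]
    B = [rng.randint(-5, 5) for _ in range(6)]
    lhs, rhs = lemma_sides(A, B)
    all_hold = all_hold and (lhs >= rhs)
```

The hand-checkable case also shows that the inequality can be strict: the left side is attained at an index outside P.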
Now from (2.2) and (2.3) we have

    $F(y+\alpha g) = \max_{i \in 1,N} f_i(y+\alpha g) = \max_{i \in 1,N} \Big\{ f_i(y) + \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big\}$

    $\ge \max_{i \in 1,N} f_i(y) + \max_{i \in P(y)} \Big[ \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big]$.    (2.6)

On the other hand, since the $f_i$'s are continuous, for any ε > 0, there exists an $\alpha_1 > 0$ such that if $\alpha \in [0, \alpha_1]$ then

    $F(y+\alpha g) = \max_{i \in 1,N} f_i(y+\alpha g) = \max_{i \in R(y,\varepsilon)} f_i(y+\alpha g)$.    (2.7)

Now ε must be large enough so that R(y,ε) contains all points which maximize F(y) within the sphere with the center at y and a radius of $\|\alpha g\|$.

From the triangle inequality we have

    $F(y+\alpha g) = \max_{i \in R(y,\varepsilon)} f_i(y+\alpha g) \le \max_{i \in 1,N} f_i(y) + \max_{i \in R(y,\varepsilon)} \Big[ \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big]$.    (2.9)


Thus F(y+αg) is between the quantities

    $F(y) + \max_{i \in P(y)} \Big[ \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big]$

and

    $F(y) + \max_{i \in R(y,\varepsilon)} \Big[ \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big]$.

As we stated previously, if ε is sufficiently small, i.e. ε < δ, then R(y,ε) = P(y). However, if we desire an ε that small, then the sphere with the center at y and a radius $\|\alpha g\|$ must not contain any other maxima besides those in P(y), i.e. α < α*.

Therefore, when α < α* and ε < δ then

    $F(y+\alpha g) = F(y) + \max_{i \in P(y)} \Big[ \sum_{k=1}^{l} \frac{\alpha^k}{k!} \frac{\partial^k f_i(y)}{\partial g^k} + o_i(\alpha^l) \Big]$    (2.10)

and the directional derivative of F is

    $\lim_{\alpha \to +0} \frac{F(y+\alpha g) - F(y)}{\alpha} = \max_{i \in P(y)} \frac{\partial f_i(y)}{\partial g}$.    (2.11)
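The directional-derivative formula can be checked numerically in a finite-dimensional setting. The sketch below is an illustration only: it works in the Euclidean plane rather than a general Hilbert space, the three functions and their gradients are hypothetical, and the tolerance used to decide which indices belong to P(y) is an assumption of the sketch.

```python
def F(fs, y):
    """Max function F(y) = max_i f_i(y)."""
    return max(f(y) for f in fs)


def directional_derivative(fs, grads, y, g, tol=1e-12):
    """Maximum over the active set P(y) of the inner product <grad f_i(y), g>."""
    Fy = F(fs, y)
    active = [i for i, f in enumerate(fs) if abs(Fy - f(y)) <= tol]
    return max(sum(p * q for p, q in zip(grads[i](y), g)) for i in active)


def one_sided_quotient(fs, y, g, alpha):
    """(F(y + alpha g) - F(y)) / alpha for a small positive alpha."""
    shifted = [yi + alpha * gi for yi, gi in zip(y, g)]
    return (F(fs, shifted) - F(fs, y)) / alpha


# Hypothetical functions on the plane; f_1 and f_2 are both active at the origin.
fs = [
    lambda y: y[0] ** 2 + y[1],
    lambda y: y[0] - y[1] ** 2,
    lambda y: -5.0,
]
grads = [
    lambda y: (2.0 * y[0], 1.0),
    lambda y: (1.0, -2.0 * y[1]),
    lambda y: (0.0, 0.0),
]
y0, g = (0.0, 0.0), (1.0, -2.0)
analytic = directional_derivative(fs, grads, y0, g)
numeric = one_sided_quotient(fs, y0, g, alpha=1e-6)
```

Here the two active functions have directional derivatives -2 and 1 along g, and the one-sided difference quotient approaches the larger of the two, as the formula states.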
3.  GRADIENT  METHOD

In  this  chapter  we  will  present  the  necessary  facts  and  methods 
for  developing  a  gradient  algorithm  for  solving  the  min-max  problem  in  Hilbert 
spaces . 

3.1.  Development  of  Gradient  Algorithm  and  Necessary  Conditions  for  Solution 

y* is a local minimum of $F(y) = \max_{i \in 1,N} f_i(y)$ if there exists some ε > 0 such that for all $y \in S$ satisfying $\|y - y^*\| < \varepsilon$, $F(y^*) \le F(y)$.*

For a gradient algorithm we assume $f_i(y)$ has a continuous Frechet derivative. We denote

    $\frac{\partial f_i(y)}{\partial g} = \langle p_i(y), g \rangle$.

Proposition 1: A necessary condition for y* to be a local min-max solution is that

    $\max_{i \in P(y^*)} \langle p_i(y^*), g \rangle \ge 0$ for all $g \in S$.

Proof: Assume that for some g'

    $\max_{i \in P(y)} \langle p_i(y), g' \rangle = k < 0$.


* This chapter follows the development of Chapter 3 in Heller's paper, but is in Hilbert spaces where his is in Euclidean spaces. Most of the propositions and theorems are easily extended, with the exception of Proposition 5 and Theorem 1 in this paper, which correspond to Proposition 5 in Heller's. In this case the proof in Hilbert spaces was considerably more involved.
Then there is an α sufficiently small to satisfy the requirements for equation (2.11) and where

    $|\alpha \langle p_i(y), g' \rangle| > |o_i(\alpha)|$ for $i \in P(y)$.

Therefore $F(y + \alpha g') < F(y)$ and y is not a local minimum. Q.E.D.

Define

    $C(y) = \{\, \gamma \mid \langle p_i(y), \gamma \rangle < 0,\ i \in P(y) \,\}$.


Proposition 2: If at a point $y \in S$ there exists a $\gamma \in C(y)$, then there exists an α > 0 such that

    $F(y + \alpha\gamma) < F(y)$.

The  proof  follows  from  the  above. 

Based on Proposition 2 a procedure can be formulated which converges to points satisfying Proposition 1. Assume $y_0$ is an arbitrary point in S and define a sequence $\{y_n\}$ in S by

    $y_{n+1} = y_n + \alpha_n \gamma_n$    (3.1.1)

where $\gamma_n \in C(y_n)$, assuming $C(y_n)$ is not empty, and $\alpha_n$ satisfies Proposition 2. If $C(y_n)$ is empty for any n, the corresponding $y_n$ satisfies Proposition 1. The sequence $F(y_n)$ is a strictly decreasing one and is bounded below by $\min_{y \in S} \max_{i \in 1,N} f_i(y)$. The sequence $F(y_n)$ therefore converges to a limit. As S is closed and bounded, $\{y_n\}$ must also converge to a limit point y* which satisfies Proposition 1. If the point y* did not satisfy Proposition 1, it would contradict the fact that F(y*) is the limit of $F(y_n)$, which follows from the continuity of F. This can be summarized in the form of a proposition.

Proposition  3.  The  sequence  of  admissible  points  in  S,  defined  by 
equation  (3.1.1)  converges  to  a  point  y*  which  satisfies  the  necessary  condi¬ 
tions  for  a  local  min-max  solution. 

Convergence  to  a  point  which  satisfies  the  necessary  conditions  for 
a  local  min-max  solution  is  guaranteed.  It  is  necessary  to  know  that  a  pro¬ 
cedure  eventually  converges,  but  to  be  useful  we  must  have  a  termination 
criterion  which  indicates  when  a  point  yn  is  in  the  neighborhood  of  the 
solution  y*.  A  neighborhood  termination  criterion  will  be  presented  in  the 
next  section. 

3.2  Termination  of  the  Algorithm 

It is desirable to terminate the sequence when a point $y_n$ is in the neighborhood of a point satisfying Proposition 1.

Define:

    $C_k(y) = \{\, \gamma \in C(y) \mid \langle p_i(y), \gamma \rangle \le -k,\ i \in P(y) \,\}$

where k > 0.

Proposition 4: The set $C_k(y)$ is empty if and only if the set C(y) is empty.

Proof: If C(y) is empty, then $C_k(y)$ must be empty since it is a subset of C(y). Assume $C_k(y)$ is empty and C(y) is not.

Then there exists a $\gamma_1 \in C(y)$ such that, for all $i \in P(y)$,

    $\langle p_i(y), \gamma_1 \rangle \le -b < 0$

but then

    $\frac{k}{b} \langle p_i(y), \gamma_1 \rangle \le -k$;

therefore, $\frac{k}{b}\gamma_1 \in C_k(y)$, which is a contradiction. Q.E.D.

Define

    $C_k^{\varepsilon}(y) = \{\, \gamma \mid \langle p_i(y), \gamma \rangle \le -k,\ i \in R(y,\varepsilon) \,\}$.

Let $\|\gamma^*\|$ be the minimum element of $C_k^{\varepsilon}(y)$; then as a sequence of points $\{y_n\}$ approaches the min-max solution y*, $\|\gamma^*\|$ approaches infinity. In order to prove this we use Proposition 1, i.e.,

    $\max_{i \in P(y^*)} \langle p_i(y^*), g \rangle \ge 0$ for all $g \in S$.    (3.2.1)

However, in order to use this information, the point $i^* \in P(y^*)$ which maximizes (3.2.1) must be included in $R(y + \alpha g, \varepsilon)$. Therefore, we must show the following:

Proposition 5: For any ε there exists a δ such that if α ≤ δ and $\|g\| = 1$ then

    $P(y) \subset R(y + \alpha g, \varepsilon)$.    (3.2.2)

Proof: Since $f_i(y)$ is continuous for all i there are $\delta_i$'s such that if $\alpha_i \le \delta_i$, then

    $|f_i(y + \alpha_i g) - f_i(y)| < \varepsilon/2$,  $i \in 1,N$

and

    $|f_i(y + \alpha_i g) - F(y)| < \varepsilon/2$,  $i \in P(y)$

since $f_i(y) = F(y)$ for $i \in P(y)$.

Let $\delta' = \min_{i \in P(y)} \delta_i$. Now if $\alpha' \le \delta'$ then

    $|f_i(y + \alpha' g) - F(y)| < \varepsilon/2$,  $i \in P(y)$.

We have shown F(y) to be Gateaux-differentiable and, hence, continuous. Therefore, there is a $\delta_0$ such that if $\alpha_0 \le \delta_0$, then $|F(y + \alpha_0 g) - F(y)| < \varepsilon/2$.

Let $\delta = \min\{\delta', \delta_0\}$ and let α ≤ δ.

Then

    $|f_i(y + \alpha g) - F(y + \alpha g)| < \varepsilon$,  $i \in P(y)$.    (3.2.3)

But all i's satisfying (3.2.3) are elements of $R(y + \alpha g, \varepsilon)$. Therefore (3.2.2) follows. Q.E.D.

Now we are able to give the condition that indicates we are approaching the min-max solution. Simply stated, the condition is that the minimal element $\|\gamma^*\|$ of $C_k^{\varepsilon}(y_n + \alpha g)$ approaches infinity as $y_n$ approaches y*. Or more formally, we have:

Theorem 1: For any ε and for any N, there is a δ such that if y* is a local min-max solution, α ≤ δ and $\|g\| \le 1$, then the minimum element $\|\gamma^*\|$ of $C_k^{\varepsilon}(y^* + \alpha g)$ is greater than N.
Proof: Since y* is a local min-max solution, then for some i, denoted i',

    $\langle p_{i'}(y^*), \gamma^* \rangle \ge 0$    (3.2.4)

and since

    $P(y^*) \subset R(y^* + \alpha g, \varepsilon)$

then $i' \in R(y^* + \alpha g, \varepsilon)$ and by the definition of γ* we have

    $\langle p_{i'}(y^* + \alpha g), \gamma^* \rangle \le -k$ for α ≤ δ.

Now combine this with (3.2.4) and we get

    $\langle p_{i'}(y^* + \alpha g) - p_{i'}(y^*), \gamma^* \rangle \le -k$    (3.2.5)

and let

    $b = p_{i'}(y^* + \alpha g) - p_{i'}(y^*)$.

Since we assumed $f_i(y)$ to have a continuous derivative, i.e. $p_i(y)$ is continuous, then for any k and any N there is a δ' such that if $\|\alpha g\| \le \delta'$ then

    $\|b\| \le \frac{k}{N}$.

Now we know from the Cauchy-Schwarz inequality,

    $|\langle b, \gamma^* \rangle| \le \|b\|\,\|\gamma^*\|$.

But from (3.2.5) $|\langle b, \gamma^* \rangle| \ge k$, so

    $\|b\|\,\|\gamma^*\| \ge k$

    $\|\gamma^*\| \ge \frac{k}{\|b\|}$

    $\|\gamma^*\| \ge N$.

Q.E.D.
3.3  The Direction to Move

From Proposition 2 and from the fact that $C_k^{\varepsilon}(y)$ is a subset of C(y) we know that any γ which is an element of $C_k^{\varepsilon}(y)$ will be a suitable direction which, with a small enough step, will decrease the max function. However, we may wish to travel in the direction of greatest descent. That is, if φ(y + γ) is defined to be

    $\phi(y + \gamma) = F(y) + \max_{i \in R(y,\varepsilon)} \langle p_i(y), \gamma \rangle$    (3.3.1)

then we want to find the γ, denoted γ*, which minimizes $\phi(y + \gamma/\|\gamma\|)$:

    $\phi\Big(y + \frac{\gamma^*}{\|\gamma^*\|}\Big) = \min_{\gamma \in C_k^{\varepsilon}(y)} \phi\Big(y + \frac{\gamma}{\|\gamma\|}\Big)$.    (3.3.2)

Theorem 2: If γ* is the minimal element of $C_k^{\varepsilon}(y)$ with respect to the norm, then γ* satisfies (3.3.2).

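Proof: The maximum entry of

    $\langle p_i(y), \gamma^* \rangle$,  $i \in R(y,\varepsilon)$

is -k. Assume for an arbitrary $\gamma \in C_k^{\varepsilon}(y)$ the maximum of $\langle p_i(y), \gamma \rangle$ over $i \in R(y,\varepsilon)$ is -b, where -b ≤ -k.

The max of $\langle p_i(y), \frac{k}{b}\gamma \rangle$ is -k and $\frac{k}{b}\gamma$ is an element of $C_k^{\varepsilon}(y)$. Therefore, we have:

    $\max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma^*}{\|\gamma^*\|} \Big\rangle = -\frac{k}{\|\gamma^*\|}$    (3.3.3)

and

    $\max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma}{\|\gamma\|} \Big\rangle = -\frac{b}{\|\gamma\|}$.    (3.3.4)

And by the assumption that $\|\gamma^*\|$ is the minimal element of $C_k^{\varepsilon}(y)$,

    $\|\gamma^*\| \le \Big\|\frac{k}{b}\gamma\Big\| = \frac{k}{b}\|\gamma\|$

or

    $\frac{k}{\|\gamma^*\|} \ge \frac{b}{\|\gamma\|}$.    (3.3.5)

Therefore, from (3.3.3), (3.3.4), and (3.3.5) we have

    $\max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma^*}{\|\gamma^*\|} \Big\rangle \le \max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma}{\|\gamma\|} \Big\rangle$.    (3.3.6)

From (3.3.1) we have

    $\phi\Big(y + \frac{\gamma^*}{\|\gamma^*\|}\Big) = F(y) + \max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma^*}{\|\gamma^*\|} \Big\rangle$    (3.3.7)

and

    $\phi\Big(y + \frac{\gamma}{\|\gamma\|}\Big) = F(y) + \max_{i \in R(y,\varepsilon)} \Big\langle p_i(y), \frac{\gamma}{\|\gamma\|} \Big\rangle$.    (3.3.8)

From (3.3.6), (3.3.7), and (3.3.8) we have

    $\phi\Big(y + \frac{\gamma^*}{\|\gamma^*\|}\Big) \le \phi\Big(y + \frac{\gamma}{\|\gamma\|}\Big)$.

Put another way we have

    $\phi\Big(y + \frac{\gamma^*}{\|\gamma^*\|}\Big) = \min_{\gamma \in C_k^{\varepsilon}(y)} \phi\Big(y + \frac{\gamma}{\|\gamma\|}\Big)$.

Q.E.D.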


3.4  Implementation  of  the  Algorithm 

We have shown the direction of steepest descent is equal to the minimum element of $C_k^{\varepsilon}(y)$; however, finding the minimum $\gamma \in C_k^{\varepsilon}(y)$ with respect to the norm is a minimization problem in Hilbert spaces. In most problems where the Hilbert space is not the Euclidean space, this algorithm would not be suitable. However, the problem can be transformed so that the minimization is carried out in a Euclidean space.*

We wish to find

    $\min \langle \gamma, \gamma \rangle^{1/2}$ such that $\langle p_i(y), \gamma \rangle \le -k$,  $i \in R(y,\varepsilon)$.    (3.4.1)

If γ' minimizes (3.4.1) it minimizes $\langle \gamma, \gamma \rangle$ subject to the same constraint, i.e.

    minimize $\langle \gamma, \gamma \rangle$ such that $\langle p_i(y), \gamma \rangle \le -k$,  $i \in R(y,\varepsilon)$.

From the Kuhn-Tucker theorem [9], $\bar{\gamma}$ represents a solution to (3.4.1) if and only if a vector $\bar{u}$ exists such that

    $\Psi(\bar{\gamma}, u) \le \Psi(\bar{\gamma}, \bar{u}) \le \Psi(\gamma, \bar{u})$

where u ≥ 0 and

    $\Psi(\gamma, u) = \langle \gamma, \gamma \rangle + \sum_{j=1}^{K} u_j \big( \langle p_j(y), \gamma \rangle + k \big)$

and K is the number of elements in R(y,ε). This yields the Kuhn-Tucker conditions:

* This section deals with the Hilbert space development of the Method of Hildreth and D'Esopo [11].
    $\langle p_i(y), \gamma \rangle + x_i = -k$,  $i \in R(y,\varepsilon)$    (3.4.2)

    $2\gamma + \sum_{j=1}^{K} u_j p_j(y) = 0$

    $u'x = 0$    (3.4.3)

    $u \ge 0$,  $x \ge 0$.    (3.4.4)

Consequently,

    $\gamma = -\frac{1}{2} \sum_{j=1}^{K} u_j p_j(y)$.    (3.4.5)

Substitute (3.4.5) into (3.4.2) and we have

    $\Big\langle p_i(y), -\frac{1}{2} \sum_{j=1}^{K} u_j p_j(y) \Big\rangle + x_i = -k$,  $i \in R(y,\varepsilon)$.

This is equivalent to

    $-\frac{1}{2} \sum_{j=1}^{K} u_j \langle p_i(y), p_j(y) \rangle + x_i = -k$.    (3.4.6)

Equation (3.4.6) along with (3.4.3) and (3.4.4) form the Kuhn-Tucker conditions of the following problem

    $\min\ u'Gu - k \sum_{j=1}^{K} u_j$  with u ≥ 0

with the matrix

    $G_{ij} = \frac{1}{4} \langle p_i(y), p_j(y) \rangle$.    (3.4.7)
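The reduction to a finite-dimensional problem can be sketched as follows. This is a minimal illustration, not the implementation used in the original work: the gradients are taken to be ordinary vectors with the Euclidean inner product (in a function space the same inner products would be integrals), the quadratic program in u is solved by a simple coordinate-wise iteration in the spirit of Hildreth's method, the gradients are assumed to be nonzero, and all names and data are hypothetical.

```python
def inner(a, b):
    """Euclidean inner product; stands in for the Hilbert-space inner product."""
    return sum(x * y for x, y in zip(a, b))


def steepest_descent_direction(p, k, sweeps=200):
    """Minimum-norm gamma with <p_i, gamma> <= -k, found through the dual in u.

    p is a list of K nonzero gradient vectors p_i(y) for the indices in R(y, eps).
    Returns (gamma, u), where u >= 0 solves min u'Gu - k * sum(u).
    """
    K = len(p)
    G = [[0.25 * inner(p[i], p[j]) for j in range(K)] for i in range(K)]
    u = [0.0] * K
    for _ in range(sweeps):
        for i in range(K):
            # Partial derivative of u'Gu - k*sum(u) with respect to u_i.
            grad_i = 2.0 * sum(G[i][j] * u[j] for j in range(K)) - k
            # Exact minimization in coordinate i, then projection onto u_i >= 0.
            u[i] = max(0.0, u[i] - grad_i / (2.0 * G[i][i]))
    dim = len(p[0])
    gamma = [-0.5 * sum(u[j] * p[j][d] for j in range(K)) for d in range(dim)]
    return gamma, u


# Hypothetical gradients of two nearly active functions, with k = 1.
p = [(1.0, 0.0), (1.0, 1.0)]
gamma, u = steepest_descent_direction(p, k=1.0)
```

For this data both constraints hold with equality at γ = (-1, 0), and only the first multiplier is positive. The point of the construction is that the iteration runs over K multipliers, no matter how large the space containing γ is.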


We  have  presented  all  the  necessary  operations  to  construct  an 
algorithm  employing  a  steepest  descent  search  and  a  stopping  criterion. 
In Chapter 7 we will present the complete algorithm and how it should be used
in  conjunction  with  a  second  order  approach.  Therefore,  we  will  now  present 
the  second  order  algorithm.  It  will  be  considered  in  two  parts.  First,  we 
will assume that each function $f_i$, where

    $F(y) = \max_{i \in 1,N} f_i(y)$,

is quadratic in y and that the total number of points N in the max space is small. Next, we will consider the case where $f_i(y)$ is not quadratic and/or the number of points N is large.


4.  SECOND  ORDER  ALGORITHM 


In  this  chapter  we  will  present  the  necessary  facts  and  methods  for 
developing  a  Newton-Raphson  or  second  order  method  for  solving  the  min-max 
problem  in  Hilbert  spaces.* 

4.1.  The  Quadratic  Min-Max  Problem 

Consider  the  min-max  problem 

    $d^* = \min_{y \in Y} F(y) = \min_{y \in Y} \max_{i \in 1,N} f_i(y)$    (4.1.1)

where the $f_i(y)$'s are quadratic functionals in the Hilbert space H, i.e.,

    $f_i(y) = a_i + \langle b_i, y \rangle + \frac{1}{2} \langle g_i(y), y \rangle$.

$g_i(y)$ is linear in y and $\langle g_i(y), y \rangle$ is positive for all y ≠ 0 and zero for y = 0.

The heart of the second order method is that the max function can be expressed as a function of the original variable and a vector variable. That is,

    $F(y) = \max_{i \in 1,N} f_i(y) = \max_{c \in C} \sum_{i=1}^{N} c_i f_i(y) = \max_{c \in C} F(c,y)$.

Also, F(c,y) possesses a saddle point, and its saddle value is equal to the minimum of F(y). That is,

    $\min_y \max_c F(c,y) = \max_c \min_y F(c,y) = \min_y F(y)$.

This can be expressed in the following two theorems.

* This chapter follows the development of Medanic except in Hilbert spaces.
Theorem 1: Let the max function

    $F_m(y) = \max_{c} \sum_{i=1}^{N} c_i f_i(y)$    (4.1.2)

where $c = (c_1, \ldots, c_N)$ and

    $\sum_{i=1}^{N} c_i = 1$,  $c_i \ge 0$,  i = 1,2,...N

and

    $F(y) = \max_{i \in 1,N} f_i(y)$.    (4.1.3)

Then for all $y \in H$

    $F(y) = F_m(y)$.

Proof: Let c* maximize (4.1.2) and let c' be such that

    $\sum_{i=1}^{N} c_i' f_i(y) = \max_{i \in 1,N} f_i(y) = F(y)$.

Clearly

    $F_m(y) = \sum_{i=1}^{N} c_i^* f_i(y) \ge \sum_{i=1}^{N} c_i' f_i(y) = F(y)$    (4.1.4)

by the definition of c*.

On the other hand, since

    $f_i(y) \le F(y)$,  $i \in 1,N$

it follows that

    $\sum_{i=1}^{N} c_i^* f_i(y) \le \sum_{i=1}^{N} c_i^* F(y) = F(y)$.

Therefore,

    $F_m(y) = \sum_{i=1}^{N} c_i^* f_i(y) \le F(y)$.    (4.1.5)

From (4.1.4) and (4.1.5) we have

    $F_m(y) = F(y)$.  Q.E.D.
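The content of Theorem 1 is that a weighted average with weights on the simplex can never exceed the largest value, and reaches it when all weight is placed on a maximizing index. A small numerical illustration follows; the values and helper names are hypothetical, and the random weights are drawn by normalizing positive numbers, which is merely one convenient way to land on the simplex.

```python
import random


def convex_combination(c, values):
    """Sum of c_i * f_i(y) for weights c and values f_i(y) at a fixed y."""
    return sum(ci * vi for ci, vi in zip(c, values))


def random_simplex_point(n, rng):
    """Weights c_i >= 0 with sum 1, obtained by normalizing positive draws."""
    w = [rng.random() + 1e-12 for _ in range(n)]
    total = sum(w)
    return [x / total for x in w]


# Hypothetical values f_1(y), ..., f_4(y) at some fixed point y.
values = [0.3, -1.2, 2.5, 2.0]
F = max(values)

rng = random.Random(0)
never_exceeds = all(
    convex_combination(random_simplex_point(len(values), rng), values) <= F + 1e-12
    for _ in range(1000)
)

# Placing all weight on a maximizing index attains F exactly.
vertex = [1.0 if v == F else 0.0 for v in values]
attained = convex_combination(vertex, values)
```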

Corollary 1. The min-max solution of the modified min-max problem

    $\min_y \max_{c \in C} F(y,c)$  where  $F(y,c) = \sum_{i=1}^{N} c_i f_i(y)$

is equivalent to the min-max solution of the original problem (4.1.1).

Theorem 2. The modified function F(y,c) possesses a global saddle point, i.e., there exists a point (y*,c*) satisfying

    $d^* = \min_y \max_{c \in C} F(y,c) = \max_{c \in C} \min_y F(y,c)$.

Proof: Ky Fan has shown that if the domains of y and c are convex and if F(y,c) is convex in y and concave in c, then

    $\min_y \max_c F(y,c) = \max_c \min_y F(y,c)$.

Hence, we must show that the above conditions are satisfied. The domain of y is not restricted and, hence, it is convex, while the domain of c is convex since it is defined as the intersection of two convex sets: the positive hyperquadrant $c_i \ge 0$, i = 1,...N, and the hyperplane $\sum_{i=1}^{N} c_i = 1$.

Also, since the $c_i$ are non-negative and $\langle g_i(y), y \rangle$ is positive for all y ≠ 0, $\sum_i c_i f_i(y)$ remains quadratic and, hence, convex. And since the modified cost functional is linear in $c_i$, i = 1,...N, it is also concave in c.

Q.E.D.
4.2  Second Order Algorithm for the Convex Min-Max Problem

Two algorithms are possible from the above discussion, where one is imbedded in the other. The first is an algorithm which solves the quadratic min-max problem where the number of points in the maximizing space is small. Since the min-max and max-min operations can be reversed, it is necessary to find the minimum of F(y,c) with respect to y in terms of c, and then maximize the resulting function of c.

This will yield the min-max solution with one maximization. Since the maximizing space c has dimension equal to the number of points in the max set 1,N, N must be kept relatively small.
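The max-min route can be illustrated on the smallest possible case: two quadratics of a scalar variable. The sketch below is an illustration under stated assumptions, not the algorithm of the original work. It writes each function as $a_i + b_i y + \frac{1}{2} q_i y^2$ with $q_i > 0$, minimizes over y in closed form for a given weight vector c = (t, 1 - t), and then maximizes the resulting concave function of t by a ternary search, which is a choice made here for simplicity. All names and coefficients are hypothetical.

```python
def inner_minimizer(c, b, q):
    """Minimizer over y of sum_i c_i (a_i + b_i y + 0.5 q_i y^2), with q_i > 0."""
    return -sum(ci * bi for ci, bi in zip(c, b)) / sum(ci * qi for ci, qi in zip(c, q))


def dual_value(c, a, b, q):
    """Value of min over y of F(y, c), evaluated at the inner minimizer."""
    y = inner_minimizer(c, b, q)
    return sum(ci * (ai + bi * y + 0.5 * qi * y * y) for ci, ai, bi, qi in zip(c, a, b, q))


def solve_two_point_minmax(a, b, q, iters=200):
    """Maximize the concave dual over c = (t, 1 - t), t in [0, 1], by ternary search."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual_value((m1, 1.0 - m1), a, b, q) < dual_value((m2, 1.0 - m2), a, b, q):
            lo = m1
        else:
            hi = m2
    t = 0.5 * (lo + hi)
    c = (t, 1.0 - t)
    return inner_minimizer(c, b, q), c, dual_value(c, a, b, q)


# Hypothetical data: f_1(y) = (y - 2)^2 and f_2(y) = y^2.
a, b, q = [4.0, 0.0], [-4.0, 0.0], [2.0, 2.0]
y_star, c_star, d_star = solve_two_point_minmax(a, b, q)
```

For this data the two parabolas cross at y = 1 with common value 1, which is the min-max value, and the maximizing weights are equal. With N points the search over c takes place on an (N - 1)-dimensional simplex, which is the reason N must stay small for this approach.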

If N is large and/or the functional is not quadratic, then an alternate procedure must be used.

We  will  consider  an  algorithm  that  makes  use  of  second  order 
variations. 

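Assume $f_i(y)$ is convex for $i \in 1,N$ and $f_i(y)$ can be differentiated twice at all points.

Then from Section 2:

    $F(y + \alpha g) = \max_{i \in 1,N} f_i(y + \alpha g)$

    $F(y + \alpha g) \ge F(y) + \max_{i \in P(y)} \Big[ \alpha \frac{\partial f_i(y)}{\partial g} + \frac{\alpha^2}{2} \frac{\partial^2 f_i(y)}{\partial g^2} + o_i(\alpha^2) \Big]$

and

    $F(y + \alpha g) \le F(y) + \max_{i \in R(y,\varepsilon)} \Big[ \alpha \frac{\partial f_i(y)}{\partial g} + \frac{\alpha^2}{2} \frac{\partial^2 f_i(y)}{\partial g^2} + o_i(\alpha^2) \Big]$.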
If $f_i(y)$ is quadratic for all $i \in 1,N$, then from any initial point we can immediately use an algorithm minimizing a quadratic. If, however, $f_i(y)$ is not quadratic and if the initial point is a large distance from the minimum, then higher order terms render the direction found no more useful than a direction found with a gradient algorithm alone. And since a direction obtained with a quadratic minimization would consume more computer time than one obtained with the gradient method, the gradient method should be used.

It would be advisable to use the gradient algorithm until the step size becomes small. Therefore, assume $f_i(y)$ is quadratic or $y_n$ is close to y* so that third or higher order terms are negligible. Then

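    $\min_{\alpha g} F(y_n + \alpha g) \ge F(y_n) + \min_{\alpha g} \max_{i \in P(y_n)} \Big[ \alpha \frac{\partial f_i(y_n)}{\partial g} + \frac{\alpha^2}{2} \frac{\partial^2 f_i(y_n)}{\partial g^2} \Big]$

and

    $\min_{\alpha g} F(y_n + \alpha g) \le F(y_n) + \min_{\alpha g} \max_{i \in R(y_n,\varepsilon)} \Big[ \alpha \frac{\partial f_i(y_n)}{\partial g} + \frac{\alpha^2}{2} \frac{\partial^2 f_i(y_n)}{\partial g^2} \Big]$.

ε must be large enough so that $R(y_n,\varepsilon)$ contains all elements i' where $i' \in P(y')$ and y' is within the sphere with $y_n$ as center and $\|y' - y_n\|$ as radius.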

The second order algorithm is as follows:

Find $\alpha^* g^*$ which minimizes

\[ \max_{i \in R(y_n,\epsilon)} \left[ \alpha \frac{\partial f_i(y_n)}{\partial g} + \frac{\alpha^2}{2} \frac{\partial^2 f_i(y_n)}{\partial g^2} \right]. \]

Then $y_{n+1} = y_n + \alpha^* g^*$ and $F(y_{n+1})$ should be less than $F(y_n)$. If it is not, it is because third order terms could not be neglected or $\epsilon$ was not large enough. If this is the case, find $\alpha' < 1$ such that

\[ F(y_n + \alpha' \alpha^* g^*) < F(y_n). \]

Repeat the algorithm with $y_{n+1}$ and $\epsilon_{n+1} \le \epsilon_n$.

Terminate the algorithm when $y_m$ is within $\delta$ of $y^*$; this occurs when $\|\alpha^* g^*\| \le \delta$ and $\epsilon$ is small enough that $R(y_m,\epsilon) = P(y_m)$.
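The safeguard in the second order algorithm (find $\alpha' < 1$ such that the function decreases) can be sketched as follows. This is only an illustration: the function name `safeguarded_step`, the halving factor and the try limit are assumptions, and the toy max function is not taken from the text.

```python
def safeguarded_step(F, y, step, shrink=0.5, max_tries=30):
    """Scale a proposed step by alpha' <= 1 until F decreases.

    F    : callable returning the max function value at a point (list of floats)
    y    : current point y_n
    step : proposed full step alpha* g*
    The shrink factor 0.5 is an assumed choice; the text only asks for alpha' < 1.
    Returns (y_new, alpha_prime); alpha_prime is 0.0 if no decrease was found.
    """
    f0 = F(y)
    factor = 1.0
    for _ in range(max_tries):
        y_new = [yi + factor * si for yi, si in zip(y, step)]
        if F(y_new) < f0:
            return y_new, factor
        factor *= shrink
    return list(y), 0.0


def F(y):
    # Toy max function: the larger of two quadratics in one variable.
    return max((y[0] - 1.0) ** 2, (y[0] + 1.0) ** 2)


# A deliberately oversized step from y = 2 is cut back until F decreases.
y_new, alpha_prime = safeguarded_step(F, [2.0], [-8.0])
```

In this toy case the full step and the half step both fail to decrease F, and the quarter step is accepted.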
5. THE MIN-MAX PROBLEM IN FUNCTION SPACES


We  have  presented  the  gradient  and  Newton-Raphson  methods  for 
solving  the  min-max  problem  in  Hilbert  spaces.  Let  us  consider  a  particular 
space,  namely  the  function  space. 

Given

\[ F(u) = \max_{i \in 1,N} K_i(u) \qquad (5.1) \]

where

\[ K_i(u) = \int_0^T V(x_i(t),u(t))\,dt \qquad (5.2) \]

and where

\[ \dot x_i(t) = f_i(x_i,u), \qquad x_i(0) = x_0. \]

Therefore, wherever the inner product was used we now have

\[ \langle a,b \rangle = \int_0^T \sum_{i=1}^{M} a_i(t)\,b_i(t)\,dt. \qquad (5.3) \]


5.1  The  Gradient  Algorithm 

We must know how to calculate the gradient. The gradient of $K_i(u)$ is:

\[ \mathrm{grad}\, K_i = -\nabla_u H_i \qquad (5.1.1) \]

where

\[ H_i = -V(x_i,u) + \lambda_i' f_i(x_i,u) \qquad (5.1.2) \]

\[ \dot x_i = \nabla_\lambda H_i, \qquad x_i(0) = x_0 \qquad (5.1.3) \]

\[ \dot\lambda_i = -\nabla_x H_i, \qquad \lambda_i(T) = 0. \qquad (5.1.4) \]

Therefore, to find the gradient of $K_i$ at a point u, solve (5.1.3) for x(t); having this, solve (5.1.4) for $\lambda(t)$. Consequently, we can find $-\nabla_u H_i$.
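As a rough illustration of this two-pass computation, the sketch below treats a scalar system $\dot x = ax + u$ with $V = x^2 + u^2$, so that $H = -(x^2 + u^2) + \lambda(ax + u)$. The system, the cost, the forward Euler discretization and the name `cost_and_gradient` are all assumptions made for the example; they are not taken from the text.

```python
import numpy as np


def cost_and_gradient(u, a=-1.0, x0=1.0, T=1.0):
    """Return K(u) and grad K = -dH/du for the assumed scalar problem.

    u holds the control on n equal sub-intervals of [0, T].
    """
    n = len(u)
    dt = T / n

    # Forward pass, as in (5.1.3): x' = a x + u, x(0) = x0.
    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        x[k + 1] = x[k] + dt * (a * x[k] + u[k])
    cost = dt * np.sum(x[:-1] ** 2 + u ** 2)

    # Backward pass, as in (5.1.4): lambda' = -dH/dx = 2x - a*lambda, lambda(T) = 0.
    lam = np.zeros(n + 1)
    for k in range(n - 1, -1, -1):
        lam[k] = lam[k + 1] * (1.0 + a * dt) - 2.0 * dt * x[k]

    # (5.1.1): grad K = -dH/du = 2u - lambda.
    grad = 2.0 * u - lam[1:]
    return cost, grad


u = np.linspace(0.0, 1.0, 20)
cost, grad = cost_and_gradient(u)
```

The backward recursion is written as the exact adjoint of the Euler scheme, so the result can be checked against a finite difference of the discretized cost divided by the interval length.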




We  can  now  use  the  gradient  algorithm  presented  in  Chapter  3. 

We have indicated how to find the gradient, namely by solving two sets of n differential equations, where n is the order of the system, and substituting the results in equation (5.1.1). The gradient is not, however, the direction to move unless there is but one point in $R(y,\epsilon)$.

We must calculate the gradient for each point in $R(y,\epsilon)$. It is then possible to find the direction to move, which is the minimum element of $C_\epsilon(y)$. To find this element we minimize

\[ \mu' G \mu - \sum_j \mu_j \qquad \text{with } \mu \ge 0 \qquad (5.1.5) \]

where G is the matrix explained in Section 3.4, formed from the gradients $p_j$, and $p_j$ is the gradient of $K_j$ associated with the jth point in $R(y,\epsilon)$. Now the direction to move is

\[ \gamma^* = -\frac{1}{2} \sum_{j=1}^{K} \mu_j\, p_j(t). \qquad (5.1.6) \]


We  can  then  find  a  suitable  step  size  and  repeat  the  process. 

Minimizing (5.1.5) is an iterative process. If a minimum is not found within a reasonable amount of time, it may be that $C_\epsilon(y)$ is empty and no value of $\mu$ exists, or $\gamma$ may be very large. Therefore, calculate $\gamma$ and if its norm is greater than some N, then decrease $\epsilon$ in $C_\epsilon(y)$ or terminate because the solution has been found. If it is less than N, continue the minimization process.
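The minimization of (5.1.5) can be sketched in a finite dimensional setting as follows. Several things here are assumptions rather than the method used in the original program: the gradients are ordinary vectors instead of functions of time, G is taken as one quarter of their inner products, the linear term is read as the sum of the $\mu_j$, and a projected gradient iteration stands in for whatever minimizer is preferred. The iteration cap plays the role of the "reasonable amount of time".

```python
import numpy as np


def min_norm_direction(grads, iters=5000, tol=1e-10):
    """Minimize mu' G mu - sum(mu) over mu >= 0 and return (mu, gamma).

    grads : sequence of gradient vectors p_j, one per point in R(y, eps)
    gamma : -0.5 * sum_j mu_j p_j, the proposed direction to move
    """
    P = np.asarray(grads, dtype=float)       # rows are the p_j
    G = 0.25 * P @ P.T                       # assumed form: G_ij = <p_i, p_j> / 4
    mu = np.zeros(len(P))
    # Step length from the largest eigenvalue of the Hessian 2G.
    lip = 2.0 * max(np.linalg.eigvalsh(G)[-1], 1e-12)
    for _ in range(iters):
        new_mu = np.maximum(mu - (2.0 * G @ mu - 1.0) / lip, 0.0)
        converged = np.max(np.abs(new_mu - mu)) < tol
        mu = new_mu
        if converged:
            break
    gamma = -0.5 * P.T @ mu
    return mu, gamma


# Two orthogonal unit gradients: the direction decreases both functions equally.
mu, gamma = min_norm_direction([[1.0, 0.0], [0.0, 1.0]])
```

For this small case the resulting direction has inner product -1 with each gradient, which is the property a common descent direction for all points in $R(y,\epsilon)$ should have.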
5.2 Second Order Algorithm

In  Chapter  4  we  stated  that  we  must  find  the  minimum  of  J(u,c) 
with  respect  to  u  in  terms  of  c.  We  will  now  show  how  this  can  be  done. 

There  are  several  ways  in  which  to  solve  this  problem.  One  method 
is  as  follows: 

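Let

\[ J(u,c) = \sum_{i=1}^{N} c_i K_i(u) = \int_0^T \sum_{i=1}^{N} c_i V(x_i,u)\,dt. \]

This can be rewritten as

\[ J(u,c) = \int_0^T \bar V(\bar x,u,c)\,dt \]

where

\[ \bar x = \begin{bmatrix} x_1 \\ \vdots \\ x_N \end{bmatrix}, \qquad \dot{\bar x} = \bar f(\bar x,u) = \begin{bmatrix} f_1(x_1,u) \\ \vdots \\ f_N(x_N,u) \end{bmatrix} \]

and $H = -\bar V(\bar x,u,c) + \lambda' \bar f(\bar x,u)$. If $x_i$ is an n vector, then $\bar x$ and $\lambda$ are $n \cdot N$ vectors.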


If $\bar f(\bar x,u)$ is linear, we can solve the Riccati equation and find $u^*(c)$, the value of u which minimizes J(u,c) in terms of c.

However, this method would consume a considerable amount of machine time and storage since we are dealing with $(N \cdot n)^2/2$ simultaneous differential equations, and they must be solved many times because J(u,c) is first minimized in terms of u, then maximized in terms of c, and this is repeated until the saddle point is found.*

Let  us  consider  another  method  for  finding  the  minimum  of  J(u,c) 
with  respect  to  u  in  terms  of  c. 

Let

\[ K_i(u) = \int_0^T V(x_i(t),u(t))\,dt. \qquad (5.2.1) \]

If V is quadratic in u or if higher order terms can be dropped then

\[ K_i(u) = a_i + \int_0^T \frac{dV}{du}\,u(t)\,dt + \frac{1}{2}\int_0^T u'(t)\,\frac{d^2V}{du^2}\,u(t)\,dt + \frac{1}{2}\int_0^T\!\!\int_0^T u'(t) \left.\frac{d^2V}{du(t)\,du(\tau)}\right|_{u=0} u(\tau)\,dt\,d\tau. \qquad (5.2.2) \]


The last two terms in (5.2.2) result from the fact that the vector $\frac{dV}{du}(x,u)$ contains u explicitly and implicitly. Where u appears explicitly, the third term results; where u is implicit in x, the fourth term appears. We can write

\[ \frac{dV}{du(t)} = \frac{\partial V}{\partial x}\,\frac{\partial x}{\partial u(t)} + \frac{\partial V}{\partial u(t)}. \]


* Working concurrently but independently, J. Medanic has produced an as yet unpublished paper dealing with the quadratic problem in function spaces. He has proofs similar to those in Section 4.1, and he uses the method employing the Riccati equation described above. He has considered an example where the max space N consists of two points. His paper does not present the method of finding the minimum of J(u,c) with respect to u in terms of c that is presented below, nor does he consider the gradient method, nor does he deal with the problems encountered when N is large, which we will discuss in Chapter 6.


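\[ \frac{d^2V}{du^2(t)} = \frac{\partial^2 V}{\partial u(t)\,\partial x}\,\frac{\partial x}{\partial u(t)} + \frac{\partial^2 V}{\partial u^2(t)} \]

\[ \frac{d^2V}{du(\tau)\,du(t)} = \frac{\partial x'}{\partial u(\tau)}\,\frac{\partial^2 V}{\partial x^2}\,\frac{\partial x}{\partial u(t)} + \frac{\partial x'}{\partial u(\tau)}\,\frac{\partial^2 V}{\partial x\,\partial u(t)} \]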


Now

\[ J(u,c) = \sum_{i=1}^{N} c_i K_i(u). \]

Find the gradient of J(u,c) with respect to u and set it equal to zero.

\[ \frac{dJ}{du} = 0 \]

and

\[ \frac{dJ}{du} = \sum_{i=1}^{N} c_i \frac{dK_i}{du} = 0 \qquad (5.2.3) \]

\[ \frac{dK_i}{du} = \frac{dV_i}{du} + u'(t)\,\frac{d^2V_i}{du^2} + \int_0^T u'(\tau)\,\frac{d^2V_i}{du(\tau)\,du(t)}\,d\tau. \qquad (5.2.4) \]

Putting equation (5.2.4) into (5.2.3) we get

\[ \sum_{i=1}^{N} c_i \left[ \frac{dV_i}{du} + u'(t)\,\frac{d^2V_i}{du^2} + \int_0^T u'(\tau)\,\frac{d^2V_i}{du(\tau)\,du(t)}\,d\tau \right] = 0. \qquad (5.2.5) \]

The integral equation (5.2.5) must now be solved. In general, this can be a fairly complex problem. We will now consider a specific example.
5.3 A Specific Example

We  will  consider  the  problem  presented  in  Chapter  1. 
Given a state equation

\[ \dot x = Ax + Bu \]

and a performance index

\[ J = x'(T)Qx(T) + \int_0^T u'(t)Hu(t)\,dt. \]


Assume H is a diagonal matrix and is not a function of time. Also assume matrices A and B are constant over time, and assume that some of the elements in A are unknown, but lie within a given range. Furthermore, assume that the $\ell$th unknown element in A can take $K_\ell$ values. Then if there are r such elements, we have a set of $\prod_{\ell=1}^{r} K_\ell$ points.

Let $N = \prod_{\ell=1}^{r} K_\ell$. Then the set of N points constitutes the maximizing space of the min-max problem

\[ \min_u \max_{i \in 1,N} J_i(u) \]

where $J_i(u)$ is the performance index when A assumes the ith value in the set $\{A_1, A_2, \ldots, A_N\}$.
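The construction of the N points of the max space can be sketched as below. The helper name `enumerate_plants`, the dictionary layout and the numerical values are assumptions made for illustration only.

```python
from itertools import product

import numpy as np


def enumerate_plants(A_nominal, uncertain):
    """Build every matrix A_i obtained by combining the candidate values.

    uncertain maps a (row, col) position in A to the list of K_l values that
    element may take; the result has prod(K_l) matrices.
    """
    positions = list(uncertain)
    plants = []
    for combo in product(*(uncertain[p] for p in positions)):
        A = np.array(A_nominal, dtype=float)
        for (r, c), value in zip(positions, combo):
            A[r, c] = value
        plants.append(A)
    return plants


# Two unknown elements with 3 and 2 candidate values give N = 6 points.
plants = enumerate_plants(
    [[0.0, 1.0], [0.0, 0.0]],
    {(0, 0): [-1.0, -2.0, -3.0], (1, 0): [0.5, 1.5]},
)
```

Since N grows as a product, a handful of uncertain elements with several values each already gives a max space of the size considered later in the text.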


We wish to demonstrate the gradient and second order algorithms for this problem. Now J is quadratic in u. However, assume that N is large; therefore, we do not want to generate a minimizing space of dimension N.

In order to find the gradient of $J_i$ we must transform J so that J takes the form of (5.2).

Solve

\[ \dot x_i = A_i x_i + Bu, \qquad x_i(0) = x_0 \]

then solve

\[ \dot\lambda = 2Q x_i(t) - A_i' \lambda \qquad \text{with } \lambda(T) = 0. \]

We  will  now  present  the  main  computational  aspects  of  the  quadratic  problem. 


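From the Taylor's series expansion in Chapter 2 we have:

\[ \max_{i \in 1,N} J_i(u_0 + h) \ge \max_{i \in 1,N} J_i(u_0) + \max_{i \in P(u_0)} \left[ \left\langle \frac{dJ_i}{du}, h \right\rangle + \frac{1}{2} \left\langle h' \frac{d^2J_i}{du^2}, h \right\rangle \right] \qquad (5.3.1) \]

and

\[ \max_{i \in 1,N} J_i(u_0 + h) \le \max_{i \in 1,N} J_i(u_0) + \max_{i \in R(u_0,\epsilon)} \left[ \left\langle \frac{dJ_i}{du}, h \right\rangle + \frac{1}{2} \left\langle h' \frac{d^2J_i}{du^2}, h \right\rangle \right]. \qquad (5.3.2) \]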


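Now we wish to find h to minimize

\[ \max_{i \in R(u_0,\epsilon)} \left[ \left\langle \frac{dJ_i}{du}(u_0), h \right\rangle + \frac{1}{2} \left\langle h' \frac{d^2J_i}{du^2}, h \right\rangle \right] \]

where $\epsilon$ is large enough that $R(u_0,\epsilon)$ contains every point which may maximize J at any u within the sphere with $u_0$ as center and $\|h\|$ as radius.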


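Now

\[ x(t) = \Phi(t,t_0)\,x_0 + \int_{t_0}^{t} \Phi(t,\tau)\,B\,u(\tau)\,d\tau \]

\[ \left.\frac{dJ_i}{du}\right|_{u_0} = 2x_i'(T)\,Q\,\frac{\partial x_i(T)}{\partial u(t)} + 2u_0'(t)H = 2x_i'(T)\,Q\,\Phi_i(T,t)\,B + 2u_0'(t)H \]

\[ \frac{d^2J_i}{du(t)\,du(\tau)} = 2B'\Phi_i'(T,t)\,Q\,\Phi_i(T,\tau)\,B + 2H. \]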


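We have now,

\[ \min_h \max_{i \in R(u_0,\epsilon)} \left\{ \int_{t_0}^{T} \left[ 2x_i'(T)Q\Phi_i(T,\tau)Bh(\tau) + 2u_0'(\tau)Hh(\tau) \right] d\tau + \int_{t_0}^{T}\!\!\int_{t_0}^{T} h'(t)B'\Phi_i'(T,t)Q\Phi_i(T,\tau)Bh(\tau)\,dt\,d\tau + \int_{t_0}^{T} h'(t)Hh(t)\,dt \right\}. \]

Now convert to a modified min-max problem.

\[ \min_h \max_{c \in C} \sum_{i=1}^{N} c_i \left\{ \int_{t_0}^{T} \left[ 2x_i'(T)Q\Phi_i(T,\tau)Bh(\tau) + 2u_0'(\tau)Hh(\tau) \right] d\tau + \int_{t_0}^{T}\!\!\int_{t_0}^{T} h'(t)B'\Phi_i'(T,t)Q\Phi_i(T,\tau)Bh(\tau)\,dt\,d\tau + \int_{t_0}^{T} h'(t)Hh(t)\,dt \right\}. \qquad (5.3.4) \]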


We can interchange the max and min operations, and we wish to find the gradient with respect to h and set it equal to zero.

\[ \sum_{i=1}^{N} c_i \left[ 2x_i'(T)Q\Phi_i(T,t)B + 2u_0'(t)H + 2\int_{t_0}^{T} h'(\tau)B'\Phi_i'(T,\tau)Q\Phi_i(T,t)B\,d\tau + 2h'(t)H \right] = 0 \qquad (5.3.5) \]


We can write this as

\[ \sum_{i=1}^{N} c_i f_i(t) + Hh(t) + \int_{t_0}^{T} \sum_{i=1}^{N} c_i G_i(t)K_i(\tau)h(\tau)\,d\tau = 0 \qquad (5.3.6) \]

where $f_i$ is an m vector, H is an m×m matrix, $G_i$ is an m×n matrix and $K_i$ is an n×m matrix.

In order to solve the integral equations (5.3.5) or (5.3.6) we must make use of the fact that if a solution exists, then h(t) must be a linear combination of all the functions in the equation. We will express this fact as equation (5.3.7). Then we will insert equation (5.3.7) into (5.3.6) and will get a set of equations which contain the vector c and a new vector b of coefficients in (5.3.7), but in which the integrals no longer contain an unknown variable.

Consequently, the integrations can be calculated initially and then for a given c one can solve the matrix equation for the vector b. Then find the vector c which maximizes (5.3.10). Note that this maximization will be an iterative process, and for each new value of c, the matrix equation must be solved for b.

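An element of the $\sum_{i=1}^{N} c_i f_i$ vector is denoted $\left[ \sum_{i=1}^{N} c_i f_i \right]_j$.

Let

\[ h_j = \left[ \sum_{i=1}^{N} c_i f_i \right]_j / H_{jj} + \sum_{p=1}^{N} \sum_{q=1}^{n} c_p\, b_{pq}\, g_{pjq}. \qquad (5.3.7) \]

Put this into equation (5.3.6) and set the coefficients of $g_{pjq}$ equal to 0. Thus

\[ c_p\, b_{pq}\, g_{pjq} + c_p\, g_{pjq} \sum_{s=1}^{m} \int_0^T k_{pqs}(\tau)\,h_s(\tau)\,d\tau = 0 \qquad (5.3.8) \]

\[ b_{pq} + \sum_{s=1}^{m} \int_0^T k_{pqs} \left\{ \left[ \sum_{i=1}^{N} c_i f_i \right]_s / H_{ss} + \sum_{k=1}^{N} \sum_{\ell=1}^{n} c_k\, b_{k\ell}\, g_{ks\ell} \right\} d\tau = 0 \qquad (5.3.9) \]

\[ b_{pq} + \sum_{i=1}^{N} c_i \sum_{s=1}^{m} \int_0^T k_{pqs}\,[f_i]_s / H_{ss}\,d\tau + \sum_{k=1}^{N} c_k \sum_{\ell=1}^{n} b_{k\ell} \sum_{s=1}^{m} \int_0^T k_{pqs}\, g_{ks\ell}\,d\tau = 0. \qquad (5.3.10) \]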


Now if $b_{pq}$ is written as a single vector, we have

\[ b + y + Ab = 0; \qquad b = -[I + A]^{-1}\,y. \]

From examining equation (5.3.10) we have $[n \cdot N \cdot m]\,[n \cdot N + 1]$ integrations to solve.
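The point that the integrations are done once while b is re-solved for each c can be sketched as follows. The array layout is an assumption: `y_parts[i]` is taken to hold the precomputed integrals multiplying $c_i$ in y, and `a_parts[k]` the precomputed integrals multiplying $c_k$ in A, each stored at full size. The name `solve_b` and the random stand-in data are hypothetical.

```python
import numpy as np


def solve_b(c, y_parts, a_parts):
    """Solve b + y + A b = 0 for a given weight vector c.

    y_parts : array (N, L), integrals computed once, independent of c
    a_parts : array (N, L, L), integrals computed once, independent of c
    """
    y = np.tensordot(c, y_parts, axes=1)    # y = sum_i c_i * y_parts[i]
    A = np.tensordot(c, a_parts, axes=1)    # A = sum_k c_k * a_parts[k]
    b = np.linalg.solve(np.eye(len(y)) + A, -y)
    return b, y, A


# Stand-in data in place of real integrals, to exercise the solve only.
rng = np.random.default_rng(0)
y_parts = rng.normal(size=(3, 4))
a_parts = 0.1 * rng.normal(size=(3, 4, 4))
b, y, A = solve_b(np.array([0.2, 0.3, 0.5]), y_parts, a_parts)
```

Using a linear solve rather than forming the inverse explicitly gives the same b as the formula above when $I + A$ is nonsingular.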



It is necessary to maximize J(u,c) with respect to c; however, it is necessary to solve the above number of integrations only once, and then a new value of b and, consequently, h can be found using equations (5.3.10) and (5.3.7).

It may be noted that for a problem where the order of the system is 4, the number of inputs is 2 and the number of points in $R(u_0,\epsilon)$ is N = 10, $[n \cdot N \cdot m]\,[n \cdot N + 1] = 3280$.

However, if the method employing the Riccati equation is used, $(n \cdot N)^2/2 = 800$ simultaneous differential equations must be solved. Experience shows that it is by far easier to solve the 3280 integrations. Also, the 800 differential equations must be solved repeatedly as J is maximized with respect to c.
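The two counts quoted above follow directly from the formulas in the text; a small check, with function names chosen here for illustration:

```python
def integration_count(n, N, m):
    """Number of integrations in the integral-equation method: [n*N*m][n*N + 1]."""
    return (n * N * m) * (n * N + 1)


def riccati_equation_count(n, N):
    """Number of simultaneous differential equations in the Riccati method: (n*N)^2 / 2."""
    return (n * N) ** 2 // 2


# System of order 4, 2 inputs, 10 points in R(u0, eps).
integrations = integration_count(n=4, N=10, m=2)
equations = riccati_equation_count(n=4, N=10)
```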

As stated in Chapter 4, a new point $u_1$ is found and the process is repeated until the distance to the minimum is below some $\epsilon$.

6. POTENTIAL DIFFICULTIES AND LIMITATIONS OF THE ALGORITHMS

In order to take a reasonably large step size with either the gradient or second order methods, $\epsilon$ in $R(u,\epsilon)$ must be large enough so that $R(u,\epsilon)$ contains all points where the max may occur within a sphere centered at u with radius equal to the step size. And, clearly, near the min-max solution at least all points which maximize the original function and all other points which are very close must be considered.

In order for the algorithms to be useful, it is assumed that a relatively small number of points will solve the max problem at the min-max solution, i.e., that $P(u_0)$ is small where $u_0$ is the min-max solution. The question remains whether this is a valid assumption.

6.1 Some Reasons Why $P(u_0)$ May Be Large

Consider a min-max problem in a finite dimensional space where the minimizing vector consists of one component. Experience has shown that the min-max may occur at two intersecting lines. It is not likely for more to intersect, but two is very probable.

Next consider the case in which the minimizing vector consists of two components. In this case three surfaces very likely intersect at the min-max. Continuing this reasoning, where the minimizing space is of order n, then n+1 points may maximize the function at the min-max.

Likewise, if the minimizing variable consists of an infinite number of components, then an infinite number of points may maximize the function at the min-max. Now if the max space consists of a finite number of points, then it is very likely that a large percentage, if not all, may maximize the function at the min-max.


Now consider the problem $\dot x = Ax + Bu$ where some elements of A may lie within a given region and

\[ J = x'(T)Qx(T) + \int_{t_0}^{T} u'(\tau)Hu(\tau)\,d\tau. \]

Now the max space consists of a non-countable number of points. If the minimizing variable consists of a non-countable number of elements as we have considered, then it is very likely that at the min-max solution a non-countable number of points from the max space may maximize the function.

From this discussion it is now clear that our original assumption, that the number of points maximizing the function at the min-max is a small number, may not be reasonable.

Three cases were considered: one where there were 125 points in the max space, one where 24 points were considered and one where only eight points were considered. The method used for finding the gradient was the one presented at the beginning of Section 5. The differential equations were solved using the Runge-Kutta-Gill method. The closer one approached the min-max solution, the greater the degree of accuracy necessary to find a point which decreased the function. With one level of accuracy a direction would be given, but even with an extremely small step size the function value could not be decreased. If the accuracy was then increased ten-fold, a direction could be found where the function value could be decreased from 10 to 30 percent in one iteration.

This indicates that the max function has very steep V-shaped ridges as you approach the min-max solution.

The  input  was  divided  into  26  parts  and  in  solving  the  differential 
equations  each  interval  was  further  divided.  In  order  to  increase  accuracy, 
each  of  the  25  intervals  was  divided  into  a  greater  number  of  sub-intervals. 

It  may  have  been  desirable  to  increase  the  original  number  of  points  in  the 
control. Possibly this is a limitation of the gradient algorithm and it may be necessary to switch to the second order method.

Sufficient accuracy was not obtained with the gradient method to determine what percentage of points in the max space were in $P(u^*)$, where $u^*$ was the min-max solution.

However, one thing is clear. Even if $P(u^*)$ does not contain the entire max space or even a large part of it, $R(u_n,\epsilon)$ must be a large proportion of the max space or, preferably, all of it in order to minimize the function in reasonable time. In the program that was constructed, $R(u_n,\epsilon)$ could only accommodate ten values. When the max space consisted of 125 points, the program took a considerable amount of computer time. The main reason for this is that during the maximizing step the full 125 points had to be expanded, and when it became necessary to increase the accuracy for solving the differential equations, the program was terminated.

The program was terminated before $\gamma^*$, the minimal element of $C_\epsilon(u)$, became large. This indicates that either $u_n$ was not close to the min-max or $\epsilon$ was not large enough so that $R(u_n,\epsilon)$ contained all the points in $P(u^*)$.

7. THE ALGORITHMS IN DETAIL

In  this  chapter  we  will  present  the  detailed  explanation  of  the 
gradient  and  Newton-Raphson  algorithms,  and  how  they  are  used  together  in 
solving  the  min-max  problem  in  function  spaces. 

7.1 The Size of $R(u_n,\epsilon)$

As we have seen, the max function has steep gradients. Therefore, it is not desirable to pick an $\epsilon$ in $R(u_n,\epsilon)$ and find the number of points in it, since it is too difficult to tell how large $\epsilon$ should be. Consequently, $R(u_n,\epsilon)$ should be as large as the following considerations allow. The algorithm which finds the minimum $\gamma \in C_\epsilon(u_n)$ becomes very time consuming as the number of elements in the vector $\mu$ increases. (The number of elements in $\mu$ equals the number in $R(u_n,\epsilon)$.) There is some optimal number of elements $R(u_n,\epsilon)$ should have, which depends on the total number of elements in the max space and the nature of the problem.

In the second order method we have the same problem, only to a much greater degree. There the procedure which finds the saddle point solution of J(u,c) becomes exponentially more complex with each new element in the vector c. As the reader will recall, this is where either a large number of differential equations must be solved repeatedly or an even larger number of integrations must be performed once.

The purpose of this discussion, then, is to use the gradient method with $R(u_n,\epsilon)$ having a large number of elements until one is near the solution and then use the quadratic method with $R(u_n,\epsilon)$ being small. This is assuming that at the min-max solution $P(u^*)$ has no more elements than the number we used for $R(u_n,\epsilon)$. That is, $R(u_n,\epsilon)$ must contain at least as many elements as $P(u^*)$, and if that is a large number, then the problem for all practical purposes cannot be solved. At large distances from the min-max, it is not necessary to have a very precise figure for the gradient. It is desirable to use a crude approximation of the gradient until you are near the min-max, where a greater degree of accuracy is necessary.

The gradient algorithm can be used even if J is not quadratic. The example presented in Chapters 1 and 5 is much more restricted than is necessary; however, to use the quadratic method which employs the solution of integral equations to solve the saddle point problem, it is necessary to have a very restricted problem. It is necessary that the matrix H in

\[ J = x'(T)Qx(T) + \int_0^T u'(t)Hu(t)\,dt \]

be diagonal. If it is not, then the set of integral equations to be solved would be much more complex.

7.2  The  Gradient  Algorithm 

1. Choose $M_1$, the number of elements in $R(u_n,\epsilon)$, the initial step size $\alpha$, the initial point in the minimizing space $u_0$, and K, explained in Step 5.

2. When $u = u_n$ calculate the value of the function J at all N points in the max space. Store the $M_1$ largest points. See Note 1 in Section 7.4.

3. Calculate the gradients for the $M_1$ points. See Note 2.

4. Calculate the $M_1$ by $M_1$ matrix G explained in Section 3.4 or Note 3.

5. Compute $\gamma^*$, the minimum element in $C_\epsilon(u_n)$. See Note 4. If $\|\gamma^*\| \ge K$, some large number, then we are getting near the min-max. If $\epsilon$ in $R(u_n,\epsilon)$ is sufficiently small, we are at the min-max. This can be determined by noting the difference between the greatest and least value of J for the values of J in $R(u_n,\epsilon)$. This difference is equal to $\epsilon$. If $\epsilon$ is not small enough to terminate, then decrease $M_1$ by one element. Now if $M_1 \le M_2$, where $M_2$ is the number of points in $R(u_n,\epsilon)$ for the Newton-Raphson method, then transfer to the Newton-Raphson algorithm. If $M_1 > M_2$ then calculate a new matrix G. Note that it should be necessary only to delete the last row and column. Find a new value of $\gamma^*$. Now if $\|\gamma^*\| < K$ proceed to Step 6.

6. Let $u_{n+1} = u_n + \alpha \gamma^*$.

7. If $F(u_{n+1}) < F(u_n)$ return to Step 2. If $F(u_{n+1}) \ge F(u_n)$ then reduce $\alpha$ by a fixed fraction, and if $\alpha > \alpha_m$ return to Step 6. Observe that it is necessary to store the old value of $u_n$ until the condition in Step 7 is satisfied.

8. If $\alpha$ is very small, i.e. less than some $\alpha_m$, the convergence is too slow. It may be that the value of $\gamma$ is not accurate enough, and this may result from inaccurate gradient calculations. Therefore, increase the number of points ND as explained in Note 1, Section 7.4. If this does not help, increase NO and ND and increase the accuracy of the procedure minimizing $\mu' G \mu - \mu$. If none of the above helps, then switch to the Newton-Raphson algorithm.
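Steps 2 and 5 rely on keeping the largest values of J and on reading $\epsilon$ off as the spread between the greatest and least stored value. A minimal sketch, assuming the J values are already available in a list; the name `largest_points` and the sample numbers are made up for the example:

```python
import heapq


def largest_points(values, m):
    """Return the indices of the m largest J values and the spread eps.

    values : J evaluated at every point of the max space
    m      : number of points kept in R(u_n, eps), i.e. M_1
    eps is the difference between the greatest and least stored value.
    """
    idx = heapq.nlargest(m, range(len(values)), key=lambda i: values[i])
    eps = values[idx[0]] - values[idx[-1]]
    return idx, eps


J_values = [3.0, 9.0, 8.5, 1.0, 8.9]
idx, eps = largest_points(J_values, 3)
```

Decreasing $M_1$ by one, as in Step 5, corresponds to dropping the last index returned, which also lowers the resulting $\epsilon$ or leaves it unchanged.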


There are two conditions where the program should be transferred to the Newton-Raphson algorithm. The most desirable one is where $\|\gamma^*\|$ approaches infinity repeatedly as the number of elements in $R(u_n,\epsilon)$ is decreased until the number in $R(u_n,\epsilon)$ is quite small.

The maximum number in $R(u_n,\epsilon)$ for the Newton-Raphson method will be denoted $M_2$ and should be much less than $M_1$; otherwise the computation time will be excessive.

The second case where the program transfers to the second order algorithm is when the gradient method produces a step which is too small to be effective.

From the experience the author has gained from working with the gradient algorithm, this will generally be the case. It may be that $\|\gamma^*\|$ never became very large and $M_1$ was never decreased. If this is the case, it may truly take excessive computer time to solve the problem. One can only experiment in order to know how large to make $M_2$, because if it is too small, you will have to decrease the step size more than what the algorithm indicates, and if it is too large, the computation will be excessive.

We  will  now  present  the  Newton-Raphson  algorithm  employing  the 
method  of  integral  equations  to  solve  the  example  presented  in  Chapters  1  and  5. 

1. Choose the size of $M_2$ and use the first value of u from the gradient algorithm.

2. With $u = u_n$ calculate the value of the function J at all N points in the max space. Store the $M_2$ largest points. If the difference between the largest and smallest value of J is less than some small quantity, then increase $M_2$ or terminate because it would not be possible to improve the solution.

3. Find the transition matrix at NO points in time.

4. Perform the integrations indicated in Step 5.

5. Form the equation b + y + Ab = 0 and solve for b for a given c. We assume that this can be accomplished. It is not necessary that there be a unique solution.

   The vector b has dimension $N \cdot n$ where $N = M_2$ and n is the order of the system. We will show how to construct y and A above:

   Form the vector $f_i = x_i'(T)Q\Phi_i(T,t)B$ at various points of time, where $\Phi_i(T,t)$ is the transition matrix associated with the ith value in $R(u_n,\epsilon)$. Form the matrices $G_i = B'\Phi_i'(T,t)Q$ where $G_i$ is a set of m×n matrices for various points of time. And form the matrices $K_i = \Phi_i(T,\tau)B$ where $K_i$ is a set of n×m matrices for various points of time.

   The notation $K_{pqs}$ or $G_{pqs}$ will represent the qth row and sth column of the matrix $K_p$ or $G_p$. The $\ell$th element of b and $\ell$th row of b + y + Ab = 0 is obtained when $\ell = (p-1)n + q$, where p is the pth element in $M_2$ and q is the row of $K_p$. Therefore

   \[ b_\ell + y_\ell + \sum_{k=1}^{N} \sum_{r=1}^{n} c_k\, b_{kr} \sum_{s=1}^{m} \int_0^T K_{pqs}(\tau)\, g_{ksr}(\tau)\,d\tau = 0. \]

   The notation $[f_i(\tau)]_s$ indicates the sth element of the vector and $H_{ss}$ is the sth diagonal element of H. Therefore

   \[ y_\ell = \sum_{i=1}^{N} c_i \sum_{s=1}^{m} \int_0^T K_{pqs}(\tau)\,[f_i(\tau)]_s / H_{ss}\,d\tau \]

   and if $\theta = (k-1)n + r$, then

   \[ A_{\ell\theta} = c_k \sum_{s=1}^{m} \int_0^T K_{pqs}(\tau)\, g_{ksr}(\tau)\,d\tau. \]

6. With the vector b found in Step 5 find

   \[ h_j = \left[ \sum_{i=1}^{N} c_i f_i \right]_j / H_{jj} + \sum_{p=1}^{N} \sum_{q=1}^{n} c_p\, b_{pq}\, g_{pjq}. \]

7. Now form

   \[ E_i = \int_0^T \left[ 2x_i'(T)Q\Phi_i(T,\tau)Bh(\tau) + 2u_n'(\tau)Hh(\tau) \right] d\tau + \int_0^T\!\!\int_0^T h'(t)B'\Phi_i'(T,t)Q\Phi_i(T,\tau)Bh(\tau)\,dt\,d\tau + \int_0^T h'(t)Hh(t)\,dt. \]

8. Find the c which maximizes $\sum_{i=1}^{N} c_i E_i$ with $0 \le c_i \le 1$ and $\sum_{i=1}^{N} c_i = 1$.

9. Return to Step 5 to find a new b. Repeat this loop until the vectors c and b no longer change. At this stage the saddle point has been found. Note that it is not necessary to find c to a high degree of accuracy in Step 8 at the early stages of the loop.

10. From the final value of b find h(t) as in Step 6, and form $u_{n+1} = u_n + h(t)$.

11. If $F(u_{n+1}) < F(u_n)$ proceed to Step 12. If $F(u_{n+1}) \ge F(u_n)$ then reduce h by a fixed fraction and return to Step 9.

12. If $\|h(t)\| > \epsilon_M$, the step was large and it is uncertain that we are now at the min-max. Return to Step 2. If $\|h(t)\| \le \epsilon_M$ then we are at the min-max if $\epsilon$ in $R(u_n,\epsilon)$ is small. If $\epsilon$ is large then reduce $\epsilon$ and, consequently, $M_2$, and return to Step 2.

7.4  The  Method  of  Calculating  Various  Quantities. 

This  section  will  be  comprised  of  notes  which  show  how  to  calculate 
the  various  quantities  presented  in  the  algorithms.  We  will  deal  implicitly 
with  the  example  presented  in  Chapters  1  and  5. 

Note  #: 

1. To calculate the function:

   \[ J_i = x_i'(T)Qx_i(T) + \int_0^T u_n'(t)Hu_n(t)\,dt \]

   \[ \dot x_i = A_i x_i + Bu_n, \qquad x_i(0) = x_0. \]

   We have a value of $u_n$ at NO points between 0 and T. Further divide each interval of time by ND points. Use the Runge-Kutta-Gill method to solve for x(t) at each of the NO × ND points and store x(t) at each of the major NO points. Now it will be necessary to use x(t) at the NO points to calculate the gradient; however, we do not want to store x(t) for all N points in the max space. Rather, store x(t) for the first $M_1$ points calculated and the value of J for each of the points. Then if a succeeding value of J is larger than any stored value, replace the stored set x(t) for which J took the least value with the new set x(t). In this way, after finding the value of J for all N points in the max space, we will have the set of x(t) for the $M_1$ largest values of J. Note that it is necessary to calculate $\int_0^T u_n'(t)Hu_n(t)\,dt$ only once since it is not a function of A.
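The integration scheme of Note 1 (NO major points, each interval subdivided ND times, storage only at the major points) can be sketched as below. Two substitutions are made and should be read as assumptions: the classical fourth order Runge-Kutta formula is used in place of the Runge-Kutta-Gill variant, and the control is held constant over each major interval. The function name `simulate` is hypothetical.

```python
import numpy as np


def simulate(A, B, u, x0, T, nd):
    """Integrate x' = A x + B u and store x only at the major points.

    u  : array (NO, m), control value on each of the NO major intervals
    nd : number of sub-steps (ND) taken inside each major interval
    Returns an array (NO + 1, n) with x at t = 0 and at the end of each interval.
    """
    no = len(u)
    h = T / (no * nd)
    x = np.array(x0, dtype=float)
    stored = np.empty((no + 1, len(x)))
    stored[0] = x
    for k in range(no):
        forcing = B @ u[k]                   # constant over this major interval
        for _ in range(nd):
            k1 = A @ x + forcing
            k2 = A @ (x + 0.5 * h * k1) + forcing
            k3 = A @ (x + 0.5 * h * k2) + forcing
            k4 = A @ (x + h * k3) + forcing
            x = x + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        stored[k + 1] = x
    return stored


# Scalar check: x' = -x with zero input decays to exp(-1) at T = 1.
traj = simulate(np.array([[-1.0]]), np.array([[1.0]]), np.zeros((10, 1)), [1.0], 1.0, nd=4)
```

Raising `nd` alone increases the accuracy of x(t) without increasing the amount stored, which is the trade-off the note describes.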


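2. Calculation of gradients. Use the same subroutine as in Note 1 for solving the equation $\dot\lambda = 2Qx_i(t) - A_i'\lambda$ with $\lambda(T) = 0$. The gradient of $K_i = -Hu_n(t) - B'\lambda_i(t)$.

3. Calculate matrix G. An element of G denoted $G_{ij}$ is equal to $\frac{1}{4}\langle p_i, p_j \rangle$, where $p_i$ and $p_j$ are the gradients found in Note 2.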
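With gradients stored as functions of time, the elements of G can be approximated with the inner product (5.3). The sketch below assumes the reading $G_{ij} = \frac{1}{4}\langle p_i, p_j \rangle$, equally spaced time samples and the trapezoidal rule; the name `gram_matrix` and the sample data are illustrative.

```python
import numpy as np


def gram_matrix(p, dt):
    """Return G with G[i, j] = 0.25 * integral over time of p_i(t)' p_j(t).

    p  : array (K, NT, m), K gradients sampled at NT times, m inputs each
    dt : spacing between time samples
    """
    K = p.shape[0]
    G = np.empty((K, K))
    for i in range(K):
        for j in range(K):
            integrand = np.sum(p[i] * p[j], axis=1)   # sum over the m inputs
            # Trapezoidal rule: end points carry half weight.
            integral = dt * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
            G[i, j] = 0.25 * integral
    return G


# Two constant gradients on [0, 1]: p_1(t) = 1 and p_2(t) = 2.
p = np.stack([np.ones((11, 1)), 2.0 * np.ones((11, 1))])
G = gram_matrix(p, dt=0.1)
```

When $M_1$ is reduced by one element, only the last row and column of this matrix need to be deleted, as noted in Step 5 of the gradient algorithm.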


4. Find $\gamma^*$, the min of $C_\epsilon(u_n)$. Minimize $\mu' G \mu - \mu$ with $\mu \ge 0$ and G as in Note 3. If it is possible to find a $\mu$ which minimizes the above, then

   \[ \gamma^* = -\frac{1}{2} \sum_{j} \mu_j\, p_j(t) \]

   as in (5.1.6). Rosenbrock's rotating coordinate system method was used in this minimization. If a minimum is not found after a given number of iterations, calculate $\gamma^*$ as above anyway. If $\|\gamma^*\|$ is less than some K, then continue the iterative process. If it is greater than K, return to the main program.

8. CONCLUSION

This work presents the first attempt at solving the min-max problem in function spaces where the max space is larger than a few elements. Initially, it was hoped that the elimination method could be used, but as explained in the Introduction, this was not possible. Two algorithms are presented, namely the Newton-Raphson and the gradient. The gradient method was programmed and the results are presented. The gradient method was successful in the range that was anticipated: when one is far from the solution.

The best method of solving the saddle point problem, which is one part of the Newton-Raphson method, is the method involving integral equations.

It is quite clear that the Newton-Raphson algorithm should be used in conjunction with the gradient method. The reason is that if one is far from the solution, then one iteration of the Newton-Raphson method will not locate a point closer to the solution than a single iteration of the gradient algorithm, while one iteration of the gradient method is considerably shorter than one of the second order method. However, as could be expected, convergence with the gradient method became slow near the minimum.

Also presented in this paper are some arguments why P(y) (all points from the max space which maximize the function at y) may be large compared to the total number of points in the max space. It was shown that if this happens, then the solution to the min-max problem is exceedingly difficult.

[Figure: flowchart of the Gradient Algorithm. Legible labels: choose $M_1$, the number of elements in $R(u_n,\epsilon)$; $\alpha_0$, the initial step size; $u_0$, the initial point; K, the number such that if $\|\gamma^*\| \ge K$ then $\|\gamma^*\|$ is assumed large; update $u_{n+1} = u_n + \alpha\gamma^*$; "Increase Accuracy in Computing Gradient or Transfer to Newton-Raphson".]

[Figure: flowchart of the Newton-Raphson Algorithm.]